Abstract

Answer Set Programming (ASP) is a well-known problem solving approach based on nonmonotonic
logic programs and efficient solvers. To enable access to external information, HEX-programs extend
programs with external atoms, which allow for a bidirectional communication between the logic
program and external sources of computation (e.g., description logic reasoners and Web resources).
Current solvers evaluate HEX-programs by a translation to ASP itself, in which values of external
atoms are guessed and verified after the ordinary answer set computation. This elegant approach does
not scale with the number of external accesses in general, in particular in presence of nondeterminism
(which is instrumental for ASP). In this paper, we present a novel, native algorithm for evaluating
HEX-programs which uses learning techniques. In particular, we extend conflict-driven ASP solving
techniques, which prevent the solver from running into the same conflict again, from ordinary to
HEX-programs. We show how to gain additional knowledge from external source evaluations and
how to use it in a conflict-driven algorithm. We first target the uninformed case, i.e., when we have
no extra information on external sources, and then extend our approach to the case where additional
meta-information is available. Experiments show that learning from external sources can significantly
decrease both the runtime and the number of considered candidate compatible sets.

KEYWORDS: Answer Set Programming, Nonmonotonic Reasoning, Conflict-Driven Clause Learning
1 Introduction

Answer Set Programming (ASP) is a declarative programming approach (Niemela 1999; 
Marek and Truszczynski 1999; Lifschitz 2002), in which solutions to a problem correspond 
to answer sets (Gelfond and Lifschitz 1991) of a logic program, which are computed using 
an ASP solver. While this approach has turned out, thanks to expressive and efficient sys- 
tems like SMODELS (Simons et al. 2002), DLV (Leone et al. 2006), ASSAT (Lin and Zhao 
2004), cmodels (Giunchiglia et al. 2006), and CLASP (Gebser et al. 2012; Gebser et al.
2011), to be fruitful for a range of applications, cf. (Brewka et al. 2011), current trends in
distributed systems and the World Wide Web, for instance, revealed the need for access to 
external sources in a program, ranging from light-weight data access (e.g., XML, RDF, or 
data bases) to knowledge-intensive formalisms (e.g., description logics). 

To cater for this need, HEX-programs (Eiter et al. 2005) extend ASP with so-called
external atoms, through which the user can couple any external data source with a logic
program. Roughly, such atoms pass information from the program, given by predicates 
and constants, to an external source which returns output values of an (abstract) function 
that it computes. This extension is convenient and has been exploited for applications in 
different areas, cf. (Eiter et al. 2011), and it is also very expressive since recursive data 
exchange between the logic program and external sources is possible. Advanced reasoning 
applications like default reasoning over description logic ontologies (Eiter et al. 2008; Dao- 
Tran et al. 2009) or reasoning over Nonmonotonic Multi-Context Systems (Brewka and 
Eiter 2007; Eiter et al. 2010) take advantage of it. 

Current algorithms for evaluating HEX-programs use a translation approach and rewrite 
them to ordinary ASP programs. The idea is to guess the truth values of external atoms 
(i.e., whether a particular fact is in the "output" of the external source access) in a modified 
program; after computing answer sets, a compatibility test checks whether the guesses 
coincide with the actual source behavior. While elegant, this approach is a bottleneck in 
advanced applications including those mentioned above. It does not scale, as blind guessing 
leads to an explosion of candidate answer sets, many of which might fail the compatibility 
test. Furthermore, a black-box view of external sources prevents any pruning of the search
space in the ASP translation, and even if properties of a source were known, it would be virtually
impossible to exploit them on-the-fly during ordinary ASP evaluation with standard solvers.

To overcome this bottleneck, a new evaluation method is needed. In this paper, we thus 
present a novel algorithm for evaluating HEX-programs, described in Section 3, which 
avoids the simple ASP translation approach. It has three key features. 

• First, it natively builds model candidates from first principles and accesses external
sources already during the model search, which allows for pruning candidates early.

• Second, it considers external sources no longer as black boxes, but exploits meta-knowledge 
about their internals. 

• And third, it takes up modern SAT and ASP solving techniques based on clause learn- 
ing (Biere et al. 2009), which led to very efficient conflict-driven algorithms for answer- 
set computation (Gebser et al. 2012; Drescher et al. 2008), and extends them to external 
sources, which is a major contribution of this work. To this end, we introduce external 
behavior learning (EBL), which generates conflict clauses (nogoods) after external source 
evaluation (Section 3). We do this in Section 4, first in the uninformed case (Section 4.1), 
where no meta-information about the external source is available, except that a certain in- 
put generates a certain output. We then exploit meta-information 1 about external sources 
(properties such as monotonicity and functionality) to learn even more effective nogoods 
which restrict the search space further (Section 4.2). 

We have implemented the new algorithm and incorporated it into the DLVHEX prototype 
system. 2 It is designed in an extensible fashion, such that the provider of external sources 
can specify refined learning functions which exploit specific knowledge about the source. 
Our theoretical work is confirmed by experiments that we conducted with our prototype on 
synthetic benchmarks and programs motivated by real-world applications (Section 5). In 



1 Not to be confused with semantically annotated data, which is not considered here. 
several cases, significant performance improvements compared to the previous algorithm
are obtained, which shows the suitability and potential of the new approach. 

2 Preliminaries 

In this section, we introduce syntax and semantics of HEX-programs and, following (Drescher 
et al. 2008), conflict-driven SAT and answer set solving. We start with basic definitions. 

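A (signed) literal is a positive or a negated ground atom $\mathbf{T}a$ or $\mathbf{F}a$, where the ground atom $a$
is of form $p(c_1, \ldots, c_\ell)$, with predicate $p$ and function-symbol free ground terms $c_1, \ldots, c_\ell$,
abbreviated as $p(\vec{c})$. For a literal $\sigma = \mathbf{T}a$ or $\sigma = \mathbf{F}a$, let $\overline{\sigma}$ denote its negation, i.e.,
$\overline{\mathbf{T}a} = \mathbf{F}a$ and $\overline{\mathbf{F}a} = \mathbf{T}a$. An assignment $\mathbf{A}$ over a (finite) set of atoms $\mathcal{A}$ is a consistent set
of signed literals $\mathbf{T}a$ or $\mathbf{F}a$, where $\mathbf{T}a$ expresses that $a \in \mathcal{A}$ is true and $\mathbf{F}a$ that it is false.

We write $\mathbf{A}^{\mathbf{T}}$ to refer to the set of elements $\mathbf{A}^{\mathbf{T}} = \{a \mid \mathbf{T}a \in \mathbf{A}\}$ and $\mathbf{A}^{\mathbf{F}}$ to refer
to $\mathbf{A}^{\mathbf{F}} = \{a \mid \mathbf{F}a \in \mathbf{A}\}$. The extension of a predicate symbol $q$ wrt. an assignment $\mathbf{A}$
is defined as $ext(q, \mathbf{A}) = \{\vec{c} \mid \mathbf{T}q(\vec{c}) \in \mathbf{A}\}$. Let further $\mathbf{A}|_q$ be the set of all signed
literals over atoms of form $q(\vec{c})$ in $\mathbf{A}$. For a list $\vec{q} = q_1, \ldots, q_k$ of predicates, we let
$\mathbf{A}|_{\vec{q}} = \mathbf{A}|_{q_1} \cup \cdots \cup \mathbf{A}|_{q_k}$.

A nogood $\{L_1, \ldots, L_n\}$ is a set of (signed) literals $L_i$, $1 \le i \le n$. An assignment $\mathbf{A}$ is
a solution to a nogood $\delta$ resp. a set of nogoods $\Delta$, iff $\delta \not\subseteq \mathbf{A}$ resp. $\delta \not\subseteq \mathbf{A}$ for all $\delta \in \Delta$.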
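
As a concrete reading of these definitions, the following Python sketch encodes signed literals, consistency, predicate extensions, and the solution test for nogoods. It is only an illustration under our own encoding assumptions: atoms are modelled as pairs of a predicate name and a tuple of constants, and the function names `negate`, `is_consistent`, `extension`, and `is_solution` are hypothetical and not part of any solver.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # (predicate, constants), e.g. ("p", ("c0",))
Literal = Tuple[str, Atom]           # ("T", atom) stands for Ta, ("F", atom) for Fa


def negate(lit: Literal) -> Literal:
    """Return the complementary signed literal."""
    sign, atom = lit
    return ("F" if sign == "T" else "T", atom)


def is_consistent(assignment: Iterable[Literal]) -> bool:
    """An assignment must not contain both Ta and Fa for any atom a."""
    lits = set(assignment)
    return all(negate(lit) not in lits for lit in lits)


def extension(pred: str, assignment: Iterable[Literal]) -> Set[Tuple[str, ...]]:
    """ext(q, A): all argument tuples c such that Tq(c) is in A."""
    return {args for sign, (p, args) in assignment if sign == "T" and p == pred}


def is_solution(assignment: Iterable[Literal],
                nogoods: Iterable[FrozenSet[Literal]]) -> bool:
    """A is a solution iff no nogood is a subset of A."""
    lits = set(assignment)
    return all(not nogood <= lits for nogood in nogoods)
```

With the assignment {Tp(c0), Fp(c1)}, for instance, the nogood {Tp(c0), Tp(c1)} is not violated, whereas {Tp(c0), Fp(c1)} is.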

2.1 HEX-Programs

We briefly recall HEX-programs, which have been introduced in Eiter et al. (2005) as a gen- 
eralization of (disjunctive) extended logic programs under the answer set semantics (Gel- 
fond and Lifschitz 1991); for more details and background, we refer to Eiter et al. (2005). 

Syntax. HEX-programs extend ordinary ASP programs by external atoms, which enable a 
bidirectional interaction between a program and external sources of computation. External 
atoms have a list of input parameters (constants or predicate names) and a list of output 
parameters. Informally, to evaluate an external atom, the reasoner passes the constants and 
extensions of the predicates in the input tuple to the external source associated with the 
external atom, which is plugged into the reasoner. The external source computes an output 
tuple, which is matched with the output list. More formally, a ground external atom is of
the form $\&g[\vec{p}](\vec{c})$, where $\vec{p} = p_1, \ldots, p_k$ are constant input parameters (predicate names
or object constants), and $\vec{c} = c_1, \ldots, c_l$ are constant output terms.

Ground HEX-programs are then defined similar to ground ordinary ASP programs. 

Definition 1 (Ground HEX-programs) A ground HEX-program consists of rules of form

$a_1 \vee \cdots \vee a_k \leftarrow b_1, \ldots, b_m, \mathrm{not}\ b_{m+1}, \ldots, \mathrm{not}\ b_n,$

where each $a_i$ for $1 \le i \le k$ is a ground atom $p(c_1, \ldots, c_\ell)$ with constants $c_j$, $1 \le j \le \ell$,
and each $b_i$ for $1 \le i \le n$ is either a classical ground atom or a ground external atom. 3

The head of a rule $r$ is $H(r) = \{a_1, \ldots, a_k\}$ and the body is $B(r) = \{b_1, \ldots, b_m,
\mathrm{not}\ b_{m+1}, \ldots, \mathrm{not}\ b_n\}$. We call $b$ or $\mathrm{not}\ b$ in a rule body a default literal; $B^+(r) =
\{b_1, \ldots, b_m\}$ is the positive body, $B^-(r) = \{b_{m+1}, \ldots, b_n\}$ is the negative body.

3 For simplicity, we do not formally introduce strong negation but see classical literals of form $\neg a$ as new atoms
together with a constraint which disallows that $a$ and $\neg a$ are simultaneously true.

In Sections 4 and 5 we will also make use of non-ground programs. However, we restrict
our theoretical investigation to ground programs, as suitable safety conditions allow for
the application of a grounding procedure (Eiter et al. 2006).

Semantics and Evaluation. The semantics of a ground external atom $\&g[\vec{p}](\vec{c})$ wrt. an
assignment $\mathbf{A}$ is given by the value of a $1{+}k{+}l$-ary Boolean oracle function $f_{\&g}$ that is
defined for all possible values of $\mathbf{A}$, $\vec{p}$ and $\vec{c}$. Thus, $\&g[\vec{p}](\vec{c})$ is true relative to $\mathbf{A}$ if and only
if it holds that $f_{\&g}(\mathbf{A}, \vec{p}, \vec{c}) = 1$. Satisfaction of ordinary rules and ASP programs (Gelfond
and Lifschitz 1991) is then extended to HEX-rules and programs in the obvious way, and
the notion of extension $ext(\cdot, \mathbf{A})$ for external predicates $\&g$ with input lists $\vec{p}$ is naturally
defined by $ext(\&g[\vec{p}], \mathbf{A}) = \{\vec{c} \mid f_{\&g}(\mathbf{A}, \vec{p}, \vec{c}) = 1\}$.

The answer sets of a HEX-program $\Pi$ are determined by the DLVHEX solver using a
transformation to ordinary ASP programs as follows. Each external atom $\&g[\vec{p}](\vec{c})$ in $\Pi$
is replaced by an ordinary ground replacement atom $e_{\&g[\vec{p}]}(\vec{c})$ and a rule $e_{\&g[\vec{p}]}(\vec{c}) \vee
ne_{\&g[\vec{p}]}(\vec{c}) \leftarrow$ is added to the program. The answer sets of the resulting guessing program
$\hat{\Pi}$ are determined by an ordinary ASP solver and projected to non-replacement atoms.
However, the resulting assignments are not necessarily models of $\Pi$, as the value of $\&g[\vec{p}]$
under $f_{\&g}$ can be different from the one of $e_{\&g[\vec{p}]}(\vec{c})$. Each answer set of $\hat{\Pi}$ is thus a
candidate compatible set (or model candidate) which must be checked against the external
sources. If no discrepancy is found, the model candidate is a compatible set of $\Pi$. More
precisely,

Definition 2 (Compatible Set) A compatible set of a program $\Pi$ is an assignment $\mathbf{A}$
(i) which is an answer set (Gelfond and Lifschitz 1991) of the guessing program $\hat{\Pi}$, and
(ii) $f_{\&g}(\mathbf{A}, \vec{p}, \vec{c}) = 1$ iff $\mathbf{T}e_{\&g[\vec{p}]}(\vec{c}) \in \mathbf{A}$ for all external atoms $\&g[\vec{p}](\vec{c})$ in $\Pi$, i.e., the
guessed values coincide with the actual output under the input from $\mathbf{A}$.

The compatible sets of $\Pi$ computed by DLVHEX include (modulo $A(\Pi)$) all answer sets
of $\Pi$ as defined in Eiter et al. (2005) using the FLP reduct (Faber et al. 2011), which we
refer to as FLP-answer sets; with an additional test on candidate answer sets $\mathbf{A}$ (which is
easily formulated as compatible set existence for a variant of $\Pi$), the FLP-answer sets can
be obtained. By default, DLVHEX computes compatible sets with smallest true part on the
original atoms; this leads to answer sets as follows.

Definition 3 (Answer Set) An (DLVHEX) answer set of $\Pi$ is any set $S \subseteq \{\mathbf{T}a \mid a \in
A(\Pi)\}$ such that (i) $S = \{\mathbf{T}a \mid a \in A(\Pi)\} \cap \mathbf{A}$ for some compatible set $\mathbf{A}$ of $\Pi$ and
(ii) $\{\mathbf{T}a \mid a \in A(\Pi)\} \cap \mathbf{A} \not\subset S$ for every compatible set $\mathbf{A}$ of $\Pi$.

The answer sets in Definition 3 include all FLP-answer sets, and in fact often coincide 
with them (as in all examples we consider). Computing the (minimal) compatible sets is 
thus a key problem for HEX-programs on which we focus here. 
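
The minimality condition of Definition 3 can be read as a simple filter over the true parts of compatible sets. The sketch below is an illustration only: it assumes that compatible sets are already given as sets of pairs (sign, atom), represents an answer set by its true atoms rather than by literals of the form Ta, and uses the hypothetical function name `answer_sets`.

```python
def answer_sets(compatible_sets, original_atoms):
    """Keep the subset-minimal true parts over the original atoms A(Pi)."""
    true_parts = {
        frozenset(atom for sign, atom in assignment
                  if sign == "T" and atom in original_atoms)
        for assignment in compatible_sets
    }
    # Condition (ii): no compatible set has a true part that is a proper subset of S.
    return {s for s in true_parts if not any(other < s for other in true_parts)}
```

For three compatible sets with true parts {a}, {a, b} and {c} over the original atoms, only {a} and {c} survive the filter.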

2.2 Conflict-driven Clause Learning and Nonchronological Backtracking 

Recall that DPLL-style SAT solvers rely on an alternation of drawing deterministic conse- 
quences and guessing the truth value of an atom towards a complete interpretation. Deter- 
ministic consequences are drawn by the basic operation of unit propagation, i.e., whenever 
all but one of the signed literals of a nogood are satisfied, the last one must be false. The solver
stores an integer decision level dl, written @dl as postfix to the signed literal. An atom 
which is set by unit propagation gets the highest decision level of all already assigned 
atoms, whereas guessing increments the current decision level. 

Most modern SAT solvers are conflict-driven, i.e., they learn additional nogoods when
the current assignment violates a nogood. This prevents the solver from running into the same
conflict again. The learned nogood is determined by initially setting the conflict nogood to 
the violated one. As long as it contains multiple literals from the same decision level, it is 
resolved with the reason of one of these literals, i.e., the nogood which implied it. 

Example 1 Consider the nogoods

$\{\mathbf{T}a, \mathbf{T}b\}, \{\mathbf{T}a, \mathbf{T}c\}, \{\mathbf{F}a, \mathbf{T}x, \mathbf{T}y\}, \{\mathbf{F}a, \mathbf{T}x, \mathbf{F}y\}, \{\mathbf{F}a, \mathbf{F}x, \mathbf{T}y\}, \{\mathbf{F}a, \mathbf{F}x, \mathbf{F}y\}$

and suppose the assignment is $\mathbf{A} = \{\mathbf{F}a@1, \mathbf{T}b@2, \mathbf{T}c@3, \mathbf{T}x@4\}$. Then the third nogood
is unit and implies $\mathbf{F}y@4$, which violates the fourth nogood $\{\mathbf{F}a, \mathbf{T}x, \mathbf{F}y\}$. As it
contains multiple literals ($x$ and $y$) which were set at decision level 4, it is resolved with
the reason for setting $y$ to false, which is the nogood $\{\mathbf{F}a, \mathbf{T}x, \mathbf{T}y\}$. This results in the
nogood $\{\mathbf{F}a, \mathbf{T}x\}$, which contains the single literal $x$ set at decision level 4, and thus is the
learned nogood.

In standard clause notation, the nogood set corresponds to

$(\neg a \vee \neg b) \wedge (\neg a \vee \neg c) \wedge (a \vee \neg x \vee \neg y) \wedge (a \vee \neg x \vee y) \wedge (a \vee x \vee \neg y) \wedge (a \vee x \vee y)$

and the violated clause is $(a \vee \neg x \vee y)$. It is resolved with $(a \vee \neg x \vee \neg y)$ and results in the
learned clause $(a \vee \neg x)$. □

State-of-the-art SAT and ASP solvers then backtrack to the second-highest decision level
in the learned nogood. In Example 1, this is decision level 1. All assignments after decision
level 1 are undone ($\mathbf{T}b@2$, $\mathbf{T}c@3$, $\mathbf{T}x@4$). Only $\mathbf{F}a@1$ remains assigned. This
makes the new nogood $\{\mathbf{F}a, \mathbf{T}x\}$ unit and derives $\mathbf{F}x$ at decision level 1.
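
The steps of Example 1 can be replayed with a small Python sketch. It is a simplified illustration, not the procedure of any particular solver: literals are pairs (sign, atom), decision levels and reasons are kept in plain dictionaries, and the names `propagate`, `analyse` and `example_1` are hypothetical.

```python
def negate(lit):
    sign, atom = lit
    return ("F" if sign == "T" else "T", atom)


NOGOODS = [
    frozenset({("T", "a"), ("T", "b")}),
    frozenset({("T", "a"), ("T", "c")}),
    frozenset({("F", "a"), ("T", "x"), ("T", "y")}),
    frozenset({("F", "a"), ("T", "x"), ("F", "y")}),
    frozenset({("F", "a"), ("F", "x"), ("T", "y")}),
    frozenset({("F", "a"), ("F", "x"), ("F", "y")}),
]


def propagate(nogoods, level, reason):
    """Unit propagation; returns a violated nogood or None.

    `level` maps assigned literals to decision levels and is extended in place;
    `reason` records, for each implied literal, the nogood that implied it.
    """
    changed = True
    while changed:
        changed = False
        for ng in nogoods:
            if all(lit in level for lit in ng):
                return ng  # conflict: the nogood is contained in the assignment
            open_lits = [lit for lit in ng if lit not in level]
            if len(open_lits) == 1 and negate(open_lits[0]) not in level:
                implied = negate(open_lits[0])
                level[implied] = max((level[l] for l in ng if l in level), default=0)
                reason[implied] = ng
                changed = True
    return None


def analyse(conflict, level, reason):
    """Resolve until one literal of the highest decision level remains."""
    learned = set(conflict)
    while True:
        top = max(level[lit] for lit in learned)
        at_top = [lit for lit in learned if level[lit] == top]
        if len(at_top) <= 1:
            break
        lit = next(l for l in at_top if l in reason)  # an implied literal
        learned = (learned | set(reason[lit])) - {lit, negate(lit)}
    levels = sorted({level[lit] for lit in learned}, reverse=True)
    backjump = levels[1] if len(levels) > 1 else 0
    return frozenset(learned), backjump


def example_1():
    level = {("F", "a"): 1, ("T", "b"): 2, ("T", "c"): 3, ("T", "x"): 4}
    reason = {}
    conflict = propagate(NOGOODS, level, reason)
    return analyse(conflict, level, reason)
```

Calling `example_1()` yields the learned nogood {Fa, Tx} together with backjump level 1, in line with the discussion above.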



2.3 Conflict-driven ASP Solving 

In this subsection we summarize conflict-driven (disjunctive) answer-set solving (Gebser 
et al. 2012; Drescher et al. 2008). It corresponds to Algorithm HEX-CDNL without Part (c), 
(cf. Section 3, where we also discuss Part (c)). Subsequently, we provide a summary of the 
base algorithm; for details we refer to Gebser et al. (2012) and Drescher et al. (2008). 

To employ conflict-driven techniques from SAT solving in ASP, programs are represented
as sets of nogoods. For a program $\Pi$, let $A(\Pi)$ be the set of all atoms occurring in
$\Pi$, and let $BA(\Pi) = \{B(r) \mid r \in \Pi\}$ be the set of all rule bodies of $\Pi$, viewed as fresh
atoms.

We first define the set $\gamma(C) = \{\{\mathbf{F}C\} \cup \{\mathbf{t}\ell \mid \ell \in C\}\} \cup \{\{\mathbf{T}C, \mathbf{f}\ell\} \mid \ell \in C\}$
of nogoods to encode that a set $C$ of default literals must be assigned $\mathbf{T}$ or $\mathbf{F}$ in terms
of the conjunction of its elements, where $\mathbf{t}\,\mathrm{not}\ a = \mathbf{F}a$, $\mathbf{t}a = \mathbf{T}a$, $\mathbf{f}\,\mathrm{not}\ a = \mathbf{T}a$, and
$\mathbf{f}a = \mathbf{F}a$. That is, the conjunction is true iff each literal is true. Clark's completion $\Delta_\Pi$ of
a program $\Pi$ over atoms $A(\Pi) \cup BA(\Pi)$ is the set of nogoods

$\Delta_\Pi = \bigcup_{r \in \Pi} \left( \gamma(B(r)) \cup \{\{\mathbf{T}B(r)\} \cup \{\mathbf{F}a \mid a \in H(r)\}\} \right).$

The body of a rule is true iff each literal is true, and if the body is true, a head literal must
also be true. Unless a program is tight (Fages 1994), Clark's completion does not fully
capture the semantics of a program; unfounded sets may occur, i.e., sets of atoms which
only cyclically support each other, called a loop. Avoidance of unfounded sets requires
additional loop nogoods, but as there are exponentially many, they are only introduced
on-the-fly.
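
To make the two displayed formulas concrete, the following Python sketch builds exactly these nogoods for a list of rules. It is an illustration under our own encoding assumptions: a default literal is a pair (positive, atom), a rule is a pair (head atoms, body literals), the fresh body atom is represented by the tuple ("body", B(r)), and the function names are hypothetical. Loop nogoods are not covered.

```python
def t(lit):
    """t l: the signed literal that makes the default literal l true."""
    positive, atom = lit
    return ("T", atom) if positive else ("F", atom)


def f(lit):
    """f l: the signed literal that makes the default literal l false."""
    positive, atom = lit
    return ("F", atom) if positive else ("T", atom)


def gamma(body):
    """Nogoods stating that the body atom is true iff all its literals are true."""
    body_atom = ("body", body)
    nogoods = {frozenset({("F", body_atom)} | {t(lit) for lit in body})}
    nogoods |= {frozenset({("T", body_atom), f(lit)}) for lit in body}
    return nogoods


def completion(program):
    """Union over all rules of gamma(B(r)) and the nogood {T B(r)} u {Fa | a in H(r)}."""
    delta = set()
    for head, body in program:
        body = frozenset(body)
        delta |= gamma(body)
        delta.add(frozenset({("T", ("body", body))} | {("F", a) for a in head}))
    return delta
```

For the single rule a ← b, not c this yields four nogoods: one forcing the body atom to be true when b is true and c is false, two forcing it to be false when b is false or c is true, and one forbidding a true body with a false head.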

Disjunctive programs require additional concepts. Neglecting details, it is common to
use additional nogoods $\Theta_{sh(\Pi)}$ derived from the shifted program $sh(\Pi)$, which encode the
loop formulas of singleton loops; a comprehensive study is available in Drescher et al.
(2008).

With these concepts we are ready to describe the basic algorithm for answer set computation
shown in HEX-CDNL. The algorithm keeps a set $\Delta_\Pi \cup \Theta_{sh(\Pi)}$ of "static" nogoods
(from Clark's completion and from singular loops), and a set $\nabla$ of "dynamic" nogoods
which are learned from conflicts and unfounded sets during execution. While constructing
the assignment $\mathbf{A}$, the algorithm stores for each atom $a \in A(\Pi)$ a decision level $dl$. The
decision level is initially 0 and incremented for each choice. Deterministic consequences
of a set of assigned values have the same decision level as the highest decision level in this
set.

The main loop iteratively derives deterministic consequences using Propagation, trying
to complete the assignment. This includes both unit propagation and unfounded set propagation.
Unit propagation derives $\overline{d}$ if $\delta \setminus \{d\} \subseteq \mathbf{A}$ for some nogood $\delta$, i.e., all but one
literal of a nogood are satisfied, therefore the last one needs to be falsified. Unfounded set
propagation detects atoms which only cyclically support each other and falsifies them.

Part (a) checks if there is a conflict, i.e., a violated nogood $\delta \subseteq \mathbf{A}$. If this is the case
we need to backtrack. For this purpose we use Analysis to compute a learned nogood $\epsilon$
and a backtrack decision level $k$. The learned nogood is added to the set of dynamic nogoods,
and assignments above decision level $k$ are undone. Otherwise, Part (b) checks if
the assignment is complete. In this case, a final unfounded set check is necessary due to
disjunctive heads. If the candidate is founded, it is an answer set. Otherwise we select a
violated loop nogood $\delta$ from the set $\lambda_\Pi(U)$ of all loop nogoods for an unfounded set $U$
(for the definition see Drescher et al. 2008), do conflict analysis and backtrack. If no
more deterministic consequences can be derived and the assignment is still incomplete, we
need to guess in Part (d) and increment the decision level. The function Select implements
a variable selection heuristic. In the simplest case it chooses an arbitrary yet unassigned
variable, but state-of-the-art heuristics are more sophisticated. E.g., Goldberg and Novikov
(2007) prefer variables which are involved in recent conflicts.

3 Algorithms for Conflict-driven HEX-Program Solving 

We now present our new, genuine algorithms for HEX-program evaluation. They are based
on Drescher et al. (2008), but integrate additional novel learning techniques to capture
the semantics of external atoms. The term learning refers to the process of adding further
nogoods to the nogood set as the search space is explored. Such nogoods are classically derived
from conflict situations to avoid similar conflicts during further search, as described above.

We add a second type of learning which captures the behavior of external sources, called 
external behavior learning (EBL). Whenever an external atom is evaluated, the algorithm 
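Algorithm HEX-Eval

Input: A HEX-program $\Pi$
Output: All answer sets of $\Pi$

$\hat{\Pi} \leftarrow \Pi$ with ext. atoms $\&g[\vec{p}](\vec{c})$ replaced by $e_{\&g[\vec{p}]}(\vec{c})$
Add guessing rules for all replacement atoms to $\hat{\Pi}$
$\nabla \leftarrow \emptyset$  // set of dynamic nogoods
$\Gamma \leftarrow \emptyset$  // set of all compatible sets
while $C \neq \bot$ do  (a)
    $C \leftarrow \bot$
    inconsistent $\leftarrow$ false
    while $C = \bot$ and inconsistent = false do  (b)
        $\mathbf{A} \leftarrow$ HEX-CDNL($\Pi, \hat{\Pi}, \nabla$)  (c)
        if $\mathbf{A} = \bot$ then inconsistent $\leftarrow$ true
        else
            compatible $\leftarrow$ true
            for all external atoms $\&g[\vec{p}]$ in $\Pi$ do  (d)
                Evaluate $\&g[\vec{p}]$ under $\mathbf{A}$
                $\nabla \leftarrow \nabla \cup \Lambda(\&g[\vec{p}], \mathbf{A})$  (e)
                Let $\mathbf{A}^{\&g[\vec{p}](\vec{c})} = 1 \Leftrightarrow \mathbf{T}e_{\&g[\vec{p}]}(\vec{c}) \in \mathbf{A}$
                if $\exists \vec{c}: f_{\&g}(\mathbf{A}, \vec{p}, \vec{c}) \neq \mathbf{A}^{\&g[\vec{p}](\vec{c})}$ then
                    Add $\mathbf{A}$ to $\nabla$
                    compatible $\leftarrow$ false
            if compatible then $C \leftarrow \mathbf{A}$
    if inconsistent = false then
        // C is a compatible set of $\Pi$
        $\nabla \leftarrow \nabla \cup \{C\}$ and $\Gamma \leftarrow \Gamma \cup \{C\}$
return $\subseteq$-minimal $\{\{\mathbf{T}a \in \mathbf{A} \mid a \in A(\Pi)\} \mid \mathbf{A} \in \Gamma\}$


Algorithm HEX-CDNL

Input: A program $\Pi$, its guessing program $\hat{\Pi}$, a set of correct nogoods $\nabla$ of $\Pi$
Output: An answer set of $\hat{\Pi}$ (candidate for a compatible set of $\Pi$) which is a solution to all nogoods $d \in \nabla$, or $\bot$ if none exists

$\mathbf{A} \leftarrow \emptyset$  // over $A(\hat{\Pi}) \cup BA(\hat{\Pi}) \cup BA(sh(\hat{\Pi}))$
$dl \leftarrow 0$  // decision level
while true do
    $(\mathbf{A}, \nabla) \leftarrow$ Propagation($\hat{\Pi}, \nabla, \mathbf{A}$)
    if $\delta \subseteq \mathbf{A}$ for some $\delta \in \Delta_{\hat{\Pi}} \cup \Theta_{sh(\hat{\Pi})} \cup \nabla$ then  (a)
        if $dl = 0$ then return $\bot$
        $(\epsilon, k) \leftarrow$ Analysis($\delta, \hat{\Pi}, \nabla, \mathbf{A}$)
        $\nabla \leftarrow \nabla \cup \{\epsilon\}$ and $dl \leftarrow k$
        $\mathbf{A} \leftarrow \mathbf{A} \setminus \{\sigma \in \mathbf{A} \mid k < dl(\sigma)\}$
    else if $\mathbf{A}^{\mathbf{T}} \cup \mathbf{A}^{\mathbf{F}} = A(\hat{\Pi}) \cup BA(\hat{\Pi}) \cup BA(sh(\hat{\Pi}))$ then  (b)
        $U \leftarrow$ UnfoundedSet($\hat{\Pi}, \mathbf{A}$)
        if $U \neq \emptyset$ then
            let $\delta \in \lambda_{\hat{\Pi}}(U)$ such that $\delta \subseteq \mathbf{A}$
            if $\{\sigma \in \delta \mid 0 < dl(\sigma)\} = \emptyset$ then return $\bot$
            $(\epsilon, k) \leftarrow$ Analysis($\delta, \hat{\Pi}, \nabla, \mathbf{A}$)
            $\nabla \leftarrow \nabla \cup \{\epsilon\}$ and $dl \leftarrow k$
            $\mathbf{A} \leftarrow \mathbf{A} \setminus \{\sigma \in \mathbf{A} \mid k < dl(\sigma)\}$
        else return $\mathbf{A}^{\mathbf{T}} \cap A(\hat{\Pi})$
    else if Heuristic decides to evaluate $\&g[\vec{p}]$ then  (c)
        Evaluate $\&g[\vec{p}]$ under $\mathbf{A}$ and set
        $\nabla \leftarrow \nabla \cup \Lambda(\&g[\vec{p}], \mathbf{A})$
    else  (d)
        $\sigma \leftarrow$ Select($\hat{\Pi}, \nabla, \mathbf{A}$) and $dl \leftarrow dl + 1$
        $\mathbf{A} \leftarrow \mathbf{A} \circ (\sigma)$
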

might learn from the call. If we have no further information about the internals of a source,
we may learn only very general input-output relationships; if we have more information,
we can learn more effective nogoods. In general, we can associate a learning function with
each external source. For the sake of introducing the evaluation algorithms, however, in
this section we abstractly consider a set of nogoods learned from the evaluation of some
external predicate with input list $\&g[\vec{p}]$, if evaluated under an assignment $\mathbf{A}$, denoted by
$\Lambda(\&g[\vec{p}], \mathbf{A})$. The next section will provide definitions of particular nogoods that can be
learned for various types of external sources, i.e., to instantiate $\Lambda(\cdot, \cdot)$. The crucial requirement
for learned nogoods is correctness, which intuitively holds if the nogood can be added
without eliminating compatible sets.

Definition 4 (Correct Nogoods) A nogood $\delta$ is correct wrt. a program $\Pi$, if all compatible
sets of $\Pi$ are solutions to $\delta$.

In our subsequent exposition we assume that the program $\Pi$ is clear from the context.
The overall approach consists of two parts. First, HEX-CDNL computes model candidates; 
it is essentially an ordinary ASP solver, but includes calls to external sources in order to 
learn additional nogoods. The external calls in this algorithm are not required for correct- 
ness of the algorithm, but may influence performance dramatically as discussed in Sec- 
tion 5. Second, Algorithm HEX-Eval uses Algorithm HEX-CDNL to produce model can- 
didates and checks each of them against the external sources (followed by a minimality 
check). Here, the external calls are crucial for correctness of the algorithm. 

For computing a model candidate, HEX-CDNL basically employs the conflict-driven approach
presented in Drescher et al. (2008) as summarized in Section 2, where the main difference
is the addition of Part (c). Our extension is driven by the following idea: whenever
(unit and unfounded set) propagation does not derive any further atoms and the assignment 
is still incomplete, the algorithm possibly evaluates external atoms (driven by a heuristic) 
instead of simply guessing truth values. This might lead to the addition of new nogoods, 
which can in turn cause the propagation procedure to derive further atoms. Guessing of 
truth values only becomes necessary if no deterministic conclusions can be drawn and the 
evaluation of external atoms does not yield further nogoods; guessing also occurs if the 
heuristic does not decide to evaluate. 

For a more formal treatment, let $\mathcal{E}$ be the set of all external predicates with input list that
occur in $\Pi$, and let $\mathcal{D}$ be the set of all signed literals over atoms in $A(\Pi) \cup A(\hat{\Pi}) \cup BA(\hat{\Pi})$.
Then, a learning function for $\Pi$ is a mapping $\Lambda : \mathcal{E} \times 2^{\mathcal{D}} \to 2^{2^{\mathcal{D}}}$. We extend our notion of
correct nogoods to correct learning functions $\Lambda(\cdot, \cdot)$, as follows:

Definition 5 A learning function $\Lambda$ is correct for a program $\Pi$, iff all $d \in \Lambda(\&g[\vec{p}], \mathbf{A})$ are
correct for $\Pi$, for all $\&g[\vec{p}]$ in $\mathcal{E}$ and $\mathbf{A} \in 2^{\mathcal{D}}$.

Restricting to learning functions that are correct for II, the following results hold. 

Proposition 1 If for input $\Pi$, $\hat{\Pi}$ and $\nabla$, HEX-CDNL returns (i) an interpretation $\mathbf{A}$, then
$\mathbf{A}$ is an answer set of $\hat{\Pi}$ and a solution to $\nabla$; (ii) $\bot$, then $\Pi$ has no compatible set that is a
solution to $\nabla$.

Proof (Sketch). (i) The proof mainly follows (Drescher et al. 2008). In our algorithm we
have potentially more nogoods, which can never produce further answer sets but only eliminate
them. Hence, each produced interpretation $\mathbf{A}$ is an answer set of $\hat{\Pi}$. (ii) By completeness
of Drescher et al. (2008) we only need to justify that adding $\Lambda(\&g[\vec{p}], \mathbf{A})$ after
evaluation of $\&g[\vec{p}]$ does not eliminate compatible sets of $\Pi$. For this purpose we need to
show that when one of the added nogoods fires, the interpretation is incompatible with the
external sources anyway. But this follows from the correctness of $\Lambda(\cdot, \cdot)$ and (for derived
nogoods) from the completeness of Drescher et al. (2008). □

The basic idea of HEX-Eval is to compute all compatible sets of $\Pi$ by the loop at (a)
and to check subset-minimality afterwards. For computing compatible sets, the loop at (b)
uses HEX-CDNL to compute answer sets of $\hat{\Pi}$ in (c), i.e., candidate compatible sets of $\Pi$,
and subsequently checks compatibility for each external atom in (d). Here the external
calls are crucial for correctness. However, different from the translation approach, the external
source evaluation serves not only for compatibility checking, but also for generating
additional dynamic nogoods $\Lambda(\&g[\vec{p}], \mathbf{A})$ in Part (e). We have the following result.

Proposition 2 HEX-Eval computes all answer sets of $\Pi$.

Proof (Sketch). We first show that the loop at (b) yields after termination a compatible
set $C$ of $\Pi$ that is a solution of $\nabla$ at the stage of entering the loop iff such a compatible set
does exist, and yields $C = \bot$ iff no such compatible set exists.

Suppose that $C \neq \bot$ after the loop. Then $C$ was assigned $\mathbf{A} \neq \bot$, which was returned
by HEX-CDNL($\Pi, \hat{\Pi}, \nabla$). From Proposition 1 (i) it follows that $C$ is an answer set of
$\hat{\Pi}$ and a solution to $\nabla$. Thus (i) of Definition 2 holds. As compatible = true, the for
loop guarantees the compatibility with the external sources in (ii) of Definition 2: if some
source output on input from $C$ is not compatible with the guess, $C$ is rejected (and added as
nogood). Otherwise $C$ coincides with the behavior of the external sources, i.e., it satisfies
(ii) of Definition 2. Thus, $C$ is a compatible set of $\Pi$ wrt. $\nabla$ at call time. As only correct
nogoods are added to $\nabla$, it is also a compatible set of $\Pi$ wrt. the initial set $\nabla$.

Otherwise, after the loop $C = \bot$. Then inconsistent = true, which means that the call
HEX-CDNL($\Pi, \hat{\Pi}, \nabla$) returned $\bot$. By Proposition 1 (ii) there is no answer set of $\hat{\Pi}$ which
is a solution to $\nabla$. As only correct nogoods were added to $\nabla$, there exists also no answer
set of $\hat{\Pi}$ which is a solution to the original set $\nabla$. Thus the loop at (b) operates as desired.

The loop at (a) then enumerates one by one all compatible sets and terminates: the update
of $\nabla$ with $C$ prevents recomputing $C$, and thus the number of compatible sets decreases.
As by Definition 3 the answer sets of $\Pi$ are the compatible sets with subset-minimal true
part of original literals, the overall algorithm correctly outputs all answer sets of $\Pi$. □

Example 2 Let $\&empty$ be an external atom with one (nonmonotonic) predicate input $p$,
such that its output is $c_0$ if the extension of $p$ is empty and $c_1$ otherwise. Consider the
program $\Pi_e$ consisting of the rules

$p(c_0). \quad dom(c_0). \quad dom(c_1). \quad dom(c_2). \quad p(X) \leftarrow dom(X), \&empty[p](X).$

Algorithm HEX-Eval transforms $\Pi_e$ into the guessing program $\hat{\Pi}_e$:

$p(c_0). \quad dom(c_0). \quad dom(c_1). \quad dom(c_2). \quad p(X) \leftarrow dom(X), e_{\&empty[p]}(X).$
$e_{\&empty[p]}(X) \vee ne_{\&empty[p]}(X) \leftarrow dom(X).$

The traditional evaluation strategy without learning will then produce $2^3$ model candidates
in HEX-CDNL, which are subsequently checked in HEX-Eval. For instance, the guess
$\{\mathbf{T}ne_{\&empty[p]}(c_0), \mathbf{T}e_{\&empty[p]}(c_1), \mathbf{T}ne_{\&empty[p]}(c_2)\}$ leads to the model candidate
$\{\mathbf{T}ne_{\&empty[p]}(c_0), \mathbf{T}e_{\&empty[p]}(c_1), \mathbf{T}ne_{\&empty[p]}(c_2), \mathbf{T}p(c_1)\}$
(neglecting false atoms and facts). This is also the only model candidate which passes the
compatibility check: $p(c_0)$ is always true, and therefore $e_{\&empty[p]}(c_1)$ must also be true
due to the definition of the external atom. This allows for deriving $p(c_1)$ by the first rule of the
program. All other atoms are false due to minimality of answer sets. □
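
The guess-and-check evaluation of Example 2 is small enough to be replayed by brute force. The following Python sketch is an illustration under simplifying assumptions, not the DLVHEX implementation: a guess is the set of constants X for which the replacement atom is guessed true, the model candidate for a fixed guess is obtained directly from the two rules for p, and the names `empty_oracle`, `model_candidates` and `compatible_sets` are hypothetical.

```python
from itertools import product

DOM = ["c0", "c1", "c2"]


def empty_oracle(p_ext):
    """&empty[p]: outputs c0 if the extension of p is empty, and c1 otherwise."""
    return {"c0"} if not p_ext else {"c1"}


def model_candidates():
    """One candidate per guess: p(c0) is a fact, p(X) holds for each guessed e(X)."""
    candidates = []
    for bits in product([False, True], repeat=len(DOM)):
        guess = frozenset(c for c, chosen in zip(DOM, bits) if chosen)
        p_ext = frozenset({"c0"} | guess)
        candidates.append((guess, p_ext))
    return candidates


def compatible_sets():
    """Keep the candidates whose guess coincides with the actual source output."""
    return [(guess, p_ext) for guess, p_ext in model_candidates()
            if empty_oracle(p_ext) == set(guess)]
```

Of the eight candidates, only the guess in which the replacement atom holds exactly for c1 passes the check, with p true for c0 and c1.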

The effects of the additionally learned nogoods will be discussed in Section 4 after
having formally specified concrete $\Lambda(\&g[\vec{p}], \mathbf{A})$ for various types of external sources.

4 Nogoods for External Behavior Learning 

We now discuss nogoods generated for external behavior learning (EBL) in detail. EBL is 
triggered by external source evaluations instead of conflicts. The basic idea is to integrate 
knowledge about the external source behavior into the program to guide the search. The 
program evaluation then starts with an empty set of learned nogoods and the preprocessor 
generates a guessing rule for each ground external atom, as discussed in Section 2. Fur- 
ther nogoods are added during the evaluation as more information about external sources 
becomes available. This is in contrast to traditional evaluation, where external atoms are 
assigned arbitrary truth values which are checked only after the assignment was completed. 

We will first show how to construct useful learned nogoods after evaluating external 
atoms, if we have no further information about the internals of external sources, called un- 
informed learning. In this case we can only learn simple input/output relationships. Subse- 
quently we consider informed learning, where additional information about properties of 
external sources is available. This allows for using more elaborate learning strategies.
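Table 1: Learned Nogoods of Example 3

Guess | Learned Nogood
$\mathbf{T}e_{\&empty[p]}(c_0), \mathbf{T}ne_{\&empty[p]}(c_1), \mathbf{T}ne_{\&empty[p]}(c_2)$ | $\{\mathbf{T}p(c_0), \mathbf{F}p(c_1), \mathbf{F}p(c_2), \mathbf{F}e_{\&empty[p]}(c_1)\}$
$\mathbf{T}e_{\&empty[p]}(c_0), \mathbf{T}ne_{\&empty[p]}(c_1), \mathbf{T}e_{\&empty[p]}(c_2), p(c_2)$ | $\{\mathbf{T}p(c_0), \mathbf{F}p(c_1), \mathbf{T}p(c_2), \mathbf{F}e_{\&empty[p]}(c_1)\}$
$\mathbf{T}e_{\&empty[p]}(c_0), \mathbf{T}e_{\&empty[p]}(c_1), \mathbf{T}ne_{\&empty[p]}(c_2), p(c_1)$ | $\{\mathbf{T}p(c_0), \mathbf{T}p(c_1), \mathbf{F}p(c_2), \mathbf{F}e_{\&empty[p]}(c_1)\}$
$\mathbf{T}e_{\&empty[p]}(c_0), \mathbf{T}e_{\&empty[p]}(c_1), \mathbf{T}e_{\&empty[p]}(c_2), p(c_1), p(c_2)$ | $\{\mathbf{T}p(c_0), \mathbf{T}p(c_1), \mathbf{T}p(c_2), \mathbf{F}e_{\&empty[p]}(c_1)\}$
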


4.1 Uninformed Learning

We first assume that we do not have information about the internals and consider external 
sources as black boxes. Hence, we can just apply very general rules for learning: when- 
ever an external predicate with input list &g[p] is evaluated under an assignment A, we 
learn that the input A| p for p = pi, . . . ,p n to the external atom &g produces the output 
ext(&g[p], A). This can be formalized as the following set of nogoods. 

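Definition 6 The learning function for a general external predicate with input list $\&g[\mathbf{p}]$
in program $\Pi$ under assignment $\mathbf{A}$ is defined as

$$\Lambda_g(\&g[\mathbf{p}], \mathbf{A}) = \{ \mathbf{A}|_{\mathbf{p}} \cup \{\mathbf{F}e_{\&g[\mathbf{p}]}(\mathbf{c})\} \mid \mathbf{c} \in \mathit{ext}(\&g[\mathbf{p}], \mathbf{A}) \} .$$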

In the simplest case, an external atom has no input and the learned nogoods are unary,
i.e., of the form $\{\mathbf{F}e_{\&g[]}(\mathbf{c})\}$. Thus, it is learned that certain tuples are in the output of the
external source, i.e., they must not be false. For external sources with input predicates, the
added nogoods encode the relationship between the output tuples and the provided input.

Example 3 (ctd.) Recall $\Pi_e$ from Example 2. Without learning, the algorithms produce
$2^3$ model candidates and check them subsequently. It turns out that EBL allows for falsification
of some of the guesses without actually evaluating the external atoms. Suppose the
reasoner first tries the guesses containing literal $\mathbf{T}e_{\&empty[p]}(c_0)$. While they are checked
against the external sources, the described learning function allows for adding the externally
learned nogoods shown in Table 1. Observe that the combination $\mathbf{T}p(c_0), \mathbf{F}p(c_1), \mathbf{F}p(c_2)$
will be reconstructed also for different choices of the guessing variables. As $p(c_0)$ is a fact,
it is true independent of the choice between $e_{\&empty[p]}(c_0)$ and $ne_{\&empty[p]}(c_0)$. E.g., the
guess $\mathbf{F}e_{\&empty[p]}(c_0), \mathbf{F}e_{\&empty[p]}(c_1), \mathbf{F}e_{\&empty[p]}(c_2)$ leads to the same extension of $p$.
This allows for reusing the nogood, which immediately invalidates the guess without evaluating
the external atoms. Different guesses with the same input to an external source allow for
reusing learned nogoods, at the latest when the candidate is complete, but before the external
source is called for validation. However, very often learning allows for discarding
guesses even earlier. For instance, we can derive $\{\mathbf{T}p(c_0), \mathbf{F}e_{\&empty[p]}(c_1)\}$ from the
nogoods above in 3 resolution steps. Such derived nogoods will be learned after running into
a couple of conflicts. We can derive $\mathbf{T}e_{\&empty[p]}(c_1)$ from $p(c_0)$ even before the truth value
of $e_{\&empty[p]}(c_1)$ is set, i.e., external learning guides the search while the traditional
evaluation algorithm considers the behavior of external sources only during postprocessing. □
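
The learning function of Definition 6 and the nogood reuse described in Example 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DLVHEX implementation: the encoding of signed literals as `(sign, atom)` pairs and the names `learn_uninformed` and `violates` are hypothetical, and the output tuple `c1` for the input Tp(c0), Fp(c1), Fp(c2) is read off the first row of Table 1.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A signed literal is a pair (sign, ground atom); this encoding is an
# assumption of the sketch, not the data structure used by DLVHEX.
Literal = Tuple[str, str]
Nogood = FrozenSet[Literal]


def learn_uninformed(ext_atom: str,
                     input_assignment: Iterable[Literal],
                     outputs: Iterable[str]) -> Set[Nogood]:
    """One nogood per output tuple c: the input A|p plus F e_{ext_atom}(c)."""
    base = frozenset(input_assignment)
    return {base | {("F", f"e_{ext_atom}({c})")} for c in outputs}


def violates(assignment: Iterable[Literal], nogood: Nogood) -> bool:
    """A nogood is violated if all of its literals occur in the assignment."""
    return nogood <= frozenset(assignment)


# Input and output of &empty[p] as in the first row of Table 1.
input_p = [("T", "p(c0)"), ("F", "p(c1)"), ("F", "p(c2)")]
learned = learn_uninformed("&empty[p]", input_p, ["c1"])

# A different guess over the replacement atoms with the same extension of p.
other_guess = input_p + [("F", "e_&empty[p](c0)"),
                         ("F", "e_&empty[p](c1)"),
                         ("F", "e_&empty[p](c2)")]
rejected = any(violates(other_guess, ng) for ng in learned)
```

Here `rejected` becomes true without any further call to the external source, which mirrors the reuse of the first learned nogood for a different guess.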
For the next result, let $\Pi$ be a program which contains an external atom of form $\&g[\mathbf{p}](\cdot)$.

Lemma 1 For all assignments $\mathbf{A}$, the nogoods $\Lambda_g(\&g[\mathbf{p}], \mathbf{A})$ (Def. 6) are correct wrt. $\Pi$.

Proof (Sketch). The added nogood for an output tuple $\mathbf{c} \in \mathit{ext}(\&g[\mathbf{p}], \mathbf{A})$ contains $\mathbf{A}|_{\mathbf{p}}$
and the negated replacement atom $\mathbf{F}e_{\&g[\mathbf{p}]}(\mathbf{c})$. If the nogood fires, then the guess was
wrong as the replacement atom is guessed false but the tuple $\mathbf{c}$ is in the output. Hence,
the interpretation is not compatible and cannot be an answer set anyway. □

4.2 Informed Learning 

The learned nogoods of the above form can become quite large as they include the whole 
input to the external source. However, known properties of external sources can be ex- 
ploited in order to learn smaller and more general nogoods. For example, if one of the 
input parameters of an external source is monotonic, it is not necessary to include informa- 
tion about false atoms in its extension, as the output will not shrink given larger input. 

Properties for informed learning can be stated on the level of either predicates or individual
external atoms. The former means that all usages of the predicate have the property.
To understand this, consider predicate $\&union$ which takes two predicate inputs $p$ and $q$
and computes the set of all elements which are in at least one of the extensions of $p$ or $q$.
It is always monotonic in both parameters, independently of its usage in a program.
Conversely, while an external source may lack a property in general, the property may hold
for particular usages.

Example 4 Consider an external atom $\&db[r_1, \ldots, r_n, \mathit{query}](\mathbf{X})$ as an interface to an
SQL query processor, which evaluates a given query (given as string) over tables (relations)
provided by predicates $r_1, \ldots, r_n$. In general, the atom will be nonmonotonic, but
for special queries (e.g., simple selection of all tuples), it will be monotonic. □

Next, we discuss two particular cases of informed learning which customize the default 
learning function for generic external sources by exploiting properties of external sources, 
and finally present examples where the learning of user-defined nogoods might be useful. 

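Monotonic Atoms. A parameter $p_i$ of an external atom $\&g$ is called monotonic, if
$f_{\&g}(\mathbf{A}, \mathbf{p}, \mathbf{c}) = 1$ implies $f_{\&g}(\mathbf{A}', \mathbf{p}, \mathbf{c}) = 1$ for all $\mathbf{A}'$ with
$\mathbf{A}'|_{p_i} \supseteq \mathbf{A}|_{p_i}$ and $\mathbf{A}'|_{p'} = \mathbf{A}|_{p'}$ for all other $p' \neq p_i$.
The learned nogoods $\Lambda(\&g[\mathbf{p}], \mathbf{A})$ after evaluating $\&g[\mathbf{p}]$ are not required to
include $\mathbf{F}p_i(t_1, \ldots, t_\ell)$ for monotonic $p_i \in \mathbf{p}$. That is, for an external predicate with
input list $\&g[\mathbf{p}]$ with monotonic input parameters $\mathbf{p}_m \subseteq \mathbf{p}$ and nonmonotonic parameters
$\mathbf{p}_n = \mathbf{p} \setminus \mathbf{p}_m$, the set of learned nogoods can be restricted as follows.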

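Definition 7 The learning function for an external predicate $\&g$ with input list $\mathbf{p}$ in
program $\Pi$ under assignment $\mathbf{A}$, such that $\&g$ is monotonic in $\mathbf{p}_m \subseteq \mathbf{p}$, is defined as

$$\Lambda_m(\&g[\mathbf{p}], \mathbf{A}) = \{ \{\mathbf{T}a \in \mathbf{A}|_{\mathbf{p}_m}\} \cup \mathbf{A}|_{\mathbf{p}_n} \cup \{\mathbf{F}e_{\&g[\mathbf{p}]}(\mathbf{c})\} \mid \mathbf{c} \in \mathit{ext}(\&g[\mathbf{p}], \mathbf{A}) \} .$$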

Example 5 Consider the external atom $\&diff[p, q](X)$ which computes the set of all elements
$X$ that are in the extension of $p$, but not in the extension of $q$. Suppose it is evaluated
under $\mathbf{A}$, s.t. $\mathit{ext}(p, \mathbf{A}) = \{\mathbf{T}p(a), \mathbf{T}p(b), \mathbf{F}p(c)\}$ and $\mathit{ext}(q, \mathbf{A}) = \{\mathbf{F}q(a), \mathbf{T}q(b), \mathbf{F}q(c)\}$.
Then the output of the atom is $\mathit{ext}(\&diff[p, q], \mathbf{A}) = \{a\}$ and the (only) naively learned
nogood is $\{\mathbf{T}p(a), \mathbf{T}p(b), \mathbf{F}p(c), \mathbf{F}q(a), \mathbf{T}q(b), \mathbf{F}q(c), \mathbf{F}e_{\&diff[p,q]}(a)\}$. However, due to
monotonicity of $\&diff[p, q]$ in $p$, it is not necessary to include $\mathbf{F}p(c)$ in the nogood; the
output of the external source will not shrink even if $p(c)$ becomes true. Therefore the
(more general) nogood $\{\mathbf{T}p(a), \mathbf{T}p(b), \mathbf{F}q(a), \mathbf{T}q(b), \mathbf{F}q(c), \mathbf{F}e_{\&diff[p,q]}(a)\}$ suffices to
correctly describe the input-output behavior. □
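
The restriction of Definition 7 can be sketched as a filter over the input assignment: false literals over monotonic parameters are dropped, everything else is kept. The following Python fragment is a minimal sketch that replays Example 5; the literal encoding as `(sign, atom)` pairs and the helper names `predicate_of` and `learn_monotonic` are assumptions made here for illustration.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Literal = Tuple[str, str]      # (sign, ground atom), an assumed encoding
Nogood = FrozenSet[Literal]


def predicate_of(atom: str) -> str:
    """Predicate name of a ground atom such as 'p(a)'."""
    return atom.split("(", 1)[0]


def learn_monotonic(ext_atom: str,
                    input_assignment: Iterable[Literal],
                    monotonic: Set[str],
                    outputs: Iterable[str]) -> Set[Nogood]:
    """Keep true literals over monotonic parameters and all literals over
    nonmonotonic parameters, then add F e_{ext_atom}(c) per output tuple c."""
    kept = frozenset((sign, atom) for sign, atom in input_assignment
                     if sign == "T" or predicate_of(atom) not in monotonic)
    return {kept | {("F", f"e_{ext_atom}({c})")} for c in outputs}


# Assignment of Example 5; &diff[p,q] is monotonic in p and outputs {a}.
assignment = [("T", "p(a)"), ("T", "p(b)"), ("F", "p(c)"),
              ("F", "q(a)"), ("T", "q(b)"), ("F", "q(c)")]
nogoods = learn_monotonic("&diff[p,q]", assignment, {"p"}, ["a"])
```

The single nogood in `nogoods` no longer mentions Fp(c), as in the more general nogood of Example 5.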

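Functional Atoms. When evaluating $\&g[\mathbf{p}]$ with some functional $\&g$ under assignment $\mathbf{A}$,
only one output tuple can be contained in $\mathit{ext}(\&g[\mathbf{p}], \mathbf{A})$, formally: for all assignments $\mathbf{A}$
and all $\mathbf{c}$, if $f_{\&g}(\mathbf{A}, \mathbf{p}, \mathbf{c}) = 1$ then $f_{\&g}(\mathbf{A}, \mathbf{p}, \mathbf{c}') = 0$ for all $\mathbf{c}' \neq \mathbf{c}$. Therefore the
following nogoods may be added right from the beginning.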

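Definition 8 The learning function for a functional external predicate $\&g$ with input list $\mathbf{p}$
in program $\Pi$ under assignment $\mathbf{A}$ is defined as

$$\Lambda_f(\&g[\mathbf{p}], \mathbf{A}) = \{ \{\mathbf{T}e_{\&g[\mathbf{p}]}(\mathbf{c}), \mathbf{T}e_{\&g[\mathbf{p}]}(\mathbf{c}')\} \mid \mathbf{c} \neq \mathbf{c}' \} .$$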

However, our implementation of this learning rule does not generate all pairs of output
tuples beforehand. Instead, it memorizes all generated output tuples $\mathbf{c}^i$, $1 \leq i \leq k$, during
evaluation of external sources. Whenever a new output tuple $\mathbf{c}'$ is added, it also adds all
nogoods which force the previously derived output tuples $\mathbf{c}^i$ to be false.
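
The lazy generation of the nogoods from Definition 8 might look as follows. This is a sketch under assumptions, not the actual DLVHEX code: the class `FunctionalLearner`, its method `add_output`, and the `(sign, atom)` literal encoding are hypothetical names chosen for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

Literal = Tuple[str, str]      # (sign, ground atom), an assumed encoding
Nogood = FrozenSet[Literal]


class FunctionalLearner:
    """Memorizes the output tuples seen so far for one functional external
    atom and emits pairwise nogoods only when a new tuple shows up."""

    def __init__(self, ext_atom: str) -> None:
        self.ext_atom = ext_atom
        self.seen: List[str] = []

    def _replacement(self, c: str) -> str:
        return f"e_{self.ext_atom}({c})"

    def add_output(self, c: str) -> Set[Nogood]:
        """Return the nogoods {T e(c), T e(c_i)} for all earlier tuples c_i."""
        if c in self.seen:
            return set()
        new = {frozenset({("T", self._replacement(c)),
                          ("T", self._replacement(prev))})
               for prev in self.seen}
        self.seen.append(c)
        return new


learner = FunctionalLearner("&g[p]")
first = learner.add_output("c1")    # nothing to pair with yet
second = learner.add_output("c2")   # pairs c2 with c1
third = learner.add_output("c3")    # pairs c3 with c1 and with c2
```

After k distinct output tuples, exactly k(k-1)/2 binary nogoods have been generated, and only for tuples that actually occurred during evaluation.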

Example 6 Consider the rules

out(X) ← &concat[A, x](X), strings(A), dom(X)
strings(X) ← dom(X), not out(X)

where $\&concat[a, b](c)$ is true iff string $c$ is the concatenation of strings $a$ and $b$, and
observe that the external atom is involved in a cycle through negation. As the extension
of the domain dom can be large, many ground instances of the external atom are generated.
The old evaluation algorithm guesses their truth values completely uninformed. E.g.,
$e_{\&concat}(x, x, xx)$ (the replacement atom of $\&concat[A, x](X)$ with $A = x$ and $X = xx$,
where dom(x) and dom(xx) are supposed to be facts) is set randomly to true or to false
in each guess, independent of previous guesses. In contrast, with learning over external
sources, the algorithm learns after the first evaluation that $e_{\&concat}(x, x, xx)$ must be true.
Knowing that $\&concat$ is functional, all atoms $e_{\&concat}(x, x, O)$ with $O \neq xx$ must also
be false. □

For the next result, let $\Pi$ be a program which contains an external atom of form $\&g[\mathbf{p}](\cdot)$.

Lemma 2 For all assignments $\mathbf{A}$, (i) the nogoods $\Lambda_m(\&g[\mathbf{p}], \mathbf{A})$ (Def. 7) are correct
wrt. $\Pi$ and (ii) if $\&g$ is functional, the nogoods $\Lambda_f(\&g[\mathbf{p}], \mathbf{A})$ (Def. 8) are correct wrt. $\Pi$.

Proof (Sketch). For monotonic external sources we must show that negative input literals
over monotonic parameters can be removed from the learned nogoods without affecting
correctness. For uninformed learning, we argued that for output tuple $\mathbf{c} \in \mathit{ext}(\&g[\mathbf{p}], \mathbf{A})$,
the replacement atom $e_{\&g[\mathbf{p}]}(\mathbf{c})$ must not be guessed false if the input to $\&g[\mathbf{p}](\mathbf{c})$ is $\mathbf{A}|_{\mathbf{p}}$
under assignment $\mathbf{A}$. However, as the output of $\&g$ grows monotonically with the extension
of a monotonic parameter $p \in \mathbf{p}_m$, the same applies for any $\mathbf{A}'$ which is "larger" in $p$, i.e.,
$\{\mathbf{T}a \in \mathbf{A}'|_p\} \supseteq \{\mathbf{T}a \in \mathbf{A}|_p\}$ and consequently $\{\mathbf{F}a \in \mathbf{A}'|_p\} \subseteq \{\mathbf{F}a \in \mathbf{A}|_p\}$. Hence, the
negative literals are not relevant wrt. output tuple $\mathbf{c}$ and can be removed from the nogood.

For functional $\&g$, we must show that the nogoods $\{\{\mathbf{T}e_{\&g[\mathbf{p}]}(\mathbf{c}), \mathbf{T}e_{\&g[\mathbf{p}]}(\mathbf{c}')\} \mid \mathbf{c} \neq \mathbf{c}'\}$
are correct. Due to functionality, the external source cannot return more than one output
tuple for the same input. Therefore no such guess can be an answer set as it is not compatible.
Hence, the nogoods do not eliminate possible answer sets. □
User-defined Learning. In many cases the developer of an external atom has more
information about the internal behavior. This allows for defining more effective nogoods.
It is therefore beneficial to give the user the possibility to customize learning functions.
Currently, user-defined functions need to directly specify the learned nogoods. The
development of a user-friendly language for writing learning functions is subject to future
work.

Example 7 Consider the program

r(X, Y) ∨ nr(X, Y) ← d(X), d(Y)
r(V, W) ← &tc[r](V, W), d(V), d(W)

It guesses, for some set of nodes d(X), all subgraphs of the complete graph. Suppose
$\&tc[r]$ checks if the edge selection r(X, Y) is transitively closed; if this is the case, the
output is empty, otherwise the set of missing transitive edges is returned. For instance, if
the extension of r is $\{(a, b), (b, c)\}$, then the output of $\&tc$ will be $\{(a, c)\}$, as this edge
is missing in order to make the graph transitively closed. The second rule eliminates all
subgraphs which are not transitively closed. Note that $\&tc$ is nonmonotonic. The guessing
program is

r(X, Y) ∨ nr(X, Y) ← d(X), d(Y)
r(V, W) ← e_{&tc[r]}(V, W), d(V), d(W)
e_{&tc[r]}(V, W) ∨ ne_{&tc[r]}(V, W) ← d(V), d(W)

The naive implementation guesses for $n$ nodes all $2^{n(n-1)/2}$ subgraphs and checks the
transitive closure for each of them, which is costly. Consider the domain $D = \{a, b, c, d, e, f\}$.
After checking one selection with r(a, b), r(b, c), nr(a, c), we know that no selection
containing these three literals will be transitively closed. This can be formalized as a
user-defined learning function. Suppose we have just checked our first guess r(a, b), r(b, c),
and nr(x, y) for all other $(x, y) \in D \times D$. Compared to the nogood learned by the general
learning function, the nogood $\{\mathbf{T}r(a, b), \mathbf{T}r(b, c), \mathbf{F}r(a, c), \mathbf{F}e_{\&tc[r]}(a, c)\}$ is a more
general description of the conflict reason, containing only relevant edges. It is immediately
violated and future guesses containing $\{\mathbf{T}r(a, b), \mathbf{T}r(b, c), \mathbf{F}r(a, c)\}$ are avoided. □
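
A user-defined learning function in the spirit of Example 7 could be sketched as below. Since the paper states that such functions currently have to specify the nogoods directly, the Python interface shown here (the function `learn_tc`, edges as pairs of node names, literals as `(sign, atom)` pairs) is purely hypothetical. The sketch also assumes that r is read as a directed relation and it skips self-loops, so that it only produces nogoods for edges (x, z) with x different from z.

```python
from typing import FrozenSet, Set, Tuple

Literal = Tuple[str, str]      # (sign, ground atom), an assumed encoding
Nogood = FrozenSet[Literal]
Edge = Tuple[str, str]


def learn_tc(edges: Set[Edge]) -> Set[Nogood]:
    """For every pair r(x,y), r(y,z) whose transitive edge r(x,z) is missing,
    learn {T r(x,y), T r(y,z), F r(x,z), F e_{&tc[r]}(x,z)}."""
    nogoods: Set[Nogood] = set()
    for x, y in edges:
        for y2, z in edges:
            # Self-loops (x == z) are skipped; this is an assumption of the sketch.
            if y == y2 and x != z and (x, z) not in edges:
                nogoods.add(frozenset({
                    ("T", f"r({x},{y})"),
                    ("T", f"r({y},{z})"),
                    ("F", f"r({x},{z})"),
                    ("F", f"e_&tc[r]({x},{z})"),
                }))
    return nogoods


# First guess of Example 7: only r(a,b) and r(b,c) are selected.
nogoods = learn_tc({("a", "b"), ("b", "c")})
```

Each learned nogood has four literals regardless of the size of the domain, whereas the nogood from the general learning function contains one literal per possible edge.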

Example 8 (Linearity) A useful learning function for &diff[p,q](X) is the following: 
whenever an element is in p but not in q, it belongs to the output of the external atom. This 
user-defined function works elementwise and produces nogoods with three literals each. 
We call this property linearity. In contrast, the naive learning function from Section 4.1
includes the complete extensions of p and q in the nogoods, which are less general. □ 
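
The elementwise learning function of Example 8 can be sketched as follows. The function name `learn_diff_linear` and the `(sign, atom)` literal encoding are assumptions of this sketch; the extensions used are those of Example 5.

```python
from typing import FrozenSet, Set, Tuple

Literal = Tuple[str, str]      # (sign, ground atom), an assumed encoding
Nogood = FrozenSet[Literal]


def learn_diff_linear(p_ext: Set[str], q_ext: Set[str]) -> Set[Nogood]:
    """For each x in p but not in q, learn {T p(x), F q(x), F e_{&diff[p,q]}(x)}."""
    return {frozenset({("T", f"p({x})"),
                       ("F", f"q({x})"),
                       ("F", f"e_&diff[p,q]({x})")})
            for x in p_ext - q_ext}


# Extensions as in Example 5: p = {a, b}, q = {b}.
nogoods = learn_diff_linear({"a", "b"}, {"b"})
```

The resulting nogood mentions only the element a, so it also applies to every other assignment in which p(a) is true and q(a) is false, independent of the remaining elements.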

For user-defined learning, correctness of the learning function must be asserted. 

5 Implementation and Evaluation 

We have integrated CLASP into our reasoner DLVHEX; previous versions of DLVHEX used 
just DLV. In order to learn nogoods from external sources we exploit clasp's SMT inter- 
face, which was previously used for the special case of constraint answer set solving and 
implemented in the CLINGCON system (Gebser et al. 2009; Ostrowski and Schaub 2012). 
We compare three configurations: DLVHEX with DLV backend, DLVHEX with (conflict- 
driven) CLASP backend but without EBL, and DLVHEX with CLASP backend and EBL. 
For our experiments we used variants of the above examples, the DLVHEX test suite,
and default reasoning over ontologies. It turned out that learning has high potential to reduce
the number of candidate models. The total number of variable assignments and
backtracks during search also decreased drastically in many cases. This suggests that candidate
rejection often needs only parts of interpretations and is possible early in the evaluation.
All benchmarks were carried out on a machine with two 12-core AMD Opteron 6176 SE
CPUs and 128 GB RAM, running Linux and using CLASP 2.0.5 and DLV Dec 21 2011 as
solver backends. For each benchmark instance, the average of three runs was calculated,
with a timeout of 300 seconds and a memout of 2 GB for each run. We report runtime
in seconds; gains and speedups are given as a factor.

Set Partitioning. The following program partitions a set $S$ into two subsets $S_1, S_2 \subseteq S$
such that $|S_1| \leq 2$. The partitioning criterion is expressed by two rules for $S_1 = S \setminus S_2$
and $S_2 = S \setminus S_1$. The implementation uses the external atom $\&diff$ (cf. Example 5):

dom(c_1). ... dom(c_n).
nsel(X) ← dom(X), &diff[dom, sel](X).
sel(X) ← dom(X), &diff[dom, nsel](X).
← sel(X), sel(Y), sel(Z), X ≠ Y, X ≠ Z, Y ≠ Z.
The results in Table 2a compare the run of the reasoner with different configurations for 
computing (i) all models resp. (ii) the first model. In both cases, using the conflict-driven 
CLASP reasoner instead of DLV as backend already improves efficiency. Adding EBL leads 
to a further improvement: in case (ii), the formerly exponentially growing runtime becomes 
almost constant. When computing all answer sets, the runtime is still exponential as expo- 
nentially many subset choices must be considered (due to the encoding); however, also in 
this case many of them can be pruned early by learning, which makes the runtime appear 
linear for the shown range of instance sizes. Moreover, our experiments show that, due to the
generation of additional nogoods, the delay between the models decreases over time when EBL
is used (not shown in the table), while it is constant without EBL.
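
To give a feeling for the gap between the subset choices that have to be considered and the selections that survive the constraint, the following Python sketch counts both for a set of n elements. It assumes that the admissible selections are exactly the subsets sel with at most two elements, as the constraint in the program suggests; the function name `count_selections` is hypothetical and the sketch says nothing about how the solver actually enumerates candidates.

```python
from itertools import combinations
from typing import Tuple


def count_selections(n: int) -> Tuple[int, int]:
    """Return (number of all subsets of an n-element set,
    number of subsets with at most two elements)."""
    elements = range(n)
    total = 0
    admissible = 0
    for k in range(n + 1):
        size_k = sum(1 for _ in combinations(elements, k))
        total += size_k
        if k <= 2:
            admissible += size_k
    return total, admissible


total_10, admissible_10 = count_selections(10)
```

For n = 10 this yields 1024 subsets of which 56 have at most two elements, so most subset choices can in principle be rejected as soon as a third element is selected.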

Default Reasoning over Description Logic Ontologies. We now consider a more realistic
scenario using the DL-plugin (Eiter et al. 2008) for DLVHEX, which integrates
description logics (DL) knowledge bases and nonmonotonic logic programs. The DL-plugin
allows accessing an ontology using the description logic reasoner RacerPro 1.9.0
(http://www.racer-systems.com/). For our first experiment, consider the program (shown left)
and the terminological part of a DL knowledge base on the right:

birds(X) ← DL[Bird](X).                                    Flier ⊑ ¬NonFlier
flies(X) ← birds(X), not neg_flies(X).                     Penguin ⊑ Bird
neg_flies(X) ← birds(X), DL[Flier ⊎ flies; ¬Flier](X).     Penguin ⊑ NonFlier

This encoding realizes the classic Tweety bird example using DL-atoms (which are an
alternative syntax for external atoms in this example and allow expressing queries over
description logics in a more accessible way). The ontology states that Flier is disjoint
with NonFlier, and that penguins are birds and do not fly; the rules express that birds fly by
default, i.e., unless the contrary is derived. The program amounts to the Ω-transformation
of default logic over ontologies to dl-programs (Dao-Tran et al. 2009), where the last rule
ensures consistency of the guess with the DL ontology. If the assertional part of the DL
knowledge base contains Penguin(tweety), then flies(tweety) is inconsistent with the
given DL-program (neg_flies(tweety) is derived by monotonicity of DL atoms and flies(tweety)
loses its support). Note that defaults cannot be encoded in standard (monotonic) description
logics; here, default behavior is achieved by the cyclic interaction of DL-rules and the DL
knowledge base.

Table 2: Benchmark Results (runtime in seconds, timeout 300s)

(a) Set Partitioning

              all models                      first model
# elements    DLV      CLASP     CLASP        DLV      CLASP     CLASP
                       w/o EBL   w EBL                 w/o EBL   w EBL
1             0.07     0.08      0.07         0.08     0.07      0.07
5             0.20     0.16      0.10         0.08     0.08      0.07
10            12.98    9.56      0.17         0.56     0.28      0.07
11            38.51    21.73     0.19         0.93     0.63      0.08
12            89.46    49.51     0.19         1.69     1.13      0.08
13            218.49   111.37    0.20         3.53     2.31      0.10
14            ---      262.67    0.28         8.76     3.69      0.10
...
18            ---      ---       0.45         128.79   62.58     0.12
19            ---      ---       0.42         ---      95.39     0.10
20            ---      ---       0.54         ---      91.16     0.11

(b) Bird-Penguin

# individuals   DLV      CLASP     CLASP
                         w/o EBL   w EBL
1               0.50     0.15      0.14
5               1.90     1.98      0.59
6               4.02     4.28      0.25
7               8.32     7.95      0.60
8               16.11    16.39     0.29
9               33.29    34.35     0.35
10              83.75    94.62     0.42
11              229.20   230.75    4.45
12              ---      ---       1.10
...
20              ---      ---       2.70

(c) Wine Ontology

            concept completion     gain
Instance    CLASP      CLASP       max      avg
            w/o EBL    w EBL
wine_0      25         31          33.02    6.93
wine_1      16         25          16.05    5.78
wine_2      14         22          11.82    4.27
wine_3      4          17          10.09    4.02
wine_4      4          17          6.83     2.87
wine_5      4          16          5.22     2.34
wine_6      4          13          2.83     1.52
wine_7      4          12          1.81     1.14
wine_8      4          4           1.88     1.08

(d) MCS

# contexts   DLV     CLASP     CLASP
                     w/o EBL   w EBL
3            0.07    0.05      0.04
4            1.04    0.68      0.14
5            0.23    0.15      0.05
6            2.63    1.44      0.12
7            8.71    4.39      0.17

As all individuals appear in the extension of the predicate flier, all of them are consid- 
ered simultaneously. This requires a guess on the ability to fly for each individual and a 
subsequent check, leading to a combinatorial explosion. Intuitively, however, the property 
can be determined for each individual independently. Hence, a query may be split into 
independent subqueries, which is achieved by our learning function for linear sources in 
Example 8. The learned nogoods are smaller and more candidate models are eliminated. 
Table 2b shows the runtime for different numbers of individuals and evaluation with and 
without EBL. The runs with EBL exhibit a significant speedup, as they exclude many 
model candidates, whereas the performance of the DLV and the CLASP backend without 
EBL is almost identical (unlike in the first example); here, most of the time is spent calling
the description logic reasoner rather than evaluating the logic program.

The findings carry over to large ontologies (DL knowledge bases) used in real-world
applications. We did similar experiments with a scaled version of the wine ontology
(http://kaon2.semanticweb.org/download/test_ontologies.zip). The instances differ in the size of the
ABox (ranging from 247 individuals in wine_0 to 20007 in wine_8) and in several other
parameters (e.g., the number of concept inclusions and concept equivalences; Motik
and Sattler (2006) describe the particular instances wine_i). We implemented a number of
default rules using an encoding analogous to the one above: e.g., wines not derivable to be dry
are not dry, wines which are not sweet are assumed to be dry, wines are white by default
unless they are known to be red. Here, we discuss the results of the latter scenario. The
experiments classified the wines in the 34 main concepts of the ontology (the immediate
subconcepts of the concept Wine, e.g., DessertWine and ItalianWine), which have
varying numbers of known concept memberships (e.g., ranging from 0 to 43, and 8 on
average, in wine_0) and percentages of red wines among them (from 0% to 100%, and 47%
on average). The results are summarized in Table 2c. There, entries for concept completion
state the number of classified concepts. Again, there is almost no difference between
the DLV and the CLASP backend without EBL, but EBL leads to a significant improvement
for most concepts and ontology sizes. E.g., there is a gain for 16 out of the 34 concepts of
the wine_0 runs, as EBL can exploit linearity. Furthermore, we observed that 6 additional
instances can be solved within the 300 seconds time limit. If a concept could be classified
both with and without EBL, we could observe a gain of up to 33.02 (on average 6.93). As
expected, larger categories profit more from EBL as we can reuse learned nogoods in these
instances.

Besides Ω, Dao-Tran et al. (2009) describe other transformations of default rules over
description logics. Experiments with these transformations revealed that the structure of the
resulting HEX-programs prohibits an effective reuse of learned nogoods. Hence, the overall
picture does not show a significant gain with EBL for these encodings; however, we could
still observe a small improvement for some runs.

Multi-Context Systems (MCS). MCS (Brewka and Eiter 2007) is a formalism for inter- 
linking multiple knowledge-based systems (the contexts). Eiter et al. (2010) define incon- 
sistency explanations (IE) for MCS, and present a system for finding such explanations 
on top of DLVHEX. In our benchmarks we computed explanations for inconsistent multi- 
context systems with 3 up to 7 contexts. For each number we computed the average runtime 
over several instances with different topologies (tree, zigzag, diamond), which were ran- 
domly created with an available benchmark generator, and report the results in Table 2d. 

Unlike in the previous benchmark we could already observe a speedup of up to 1.98 
when using CLASP instead of the DLV backend. This is because of two reasons: first, CLASP 
is more efficient than DLV for the given problem, and second, CLASP was tightly integrated 
into DLVHEX, whereas using DLV requires interprocess communication. However, the most 
important aspect is again EBL, which leads to a further significant speedup with a factor 
of up to 25.82 compared to CLASP without EBL. 
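
Both factors can be recomputed from the runtimes in Table 2d. The largest ratios occur in the row for 7 contexts (8.71 s for DLV, 4.39 s for CLASP without EBL, 0.17 s for CLASP with EBL):

$$\frac{8.71}{4.39} \approx 1.98, \qquad \frac{4.39}{0.17} \approx 25.82 .$$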

Logic Puzzles. Another experiment concerns logic puzzles. We encoded Sudoku as a 
HEX-program, such that the logic program makes a guess of assignments to the fields and 
an external atom is used for verifying the answer. In case of a negative verification result, 
the external atom indicates by user-defined learning rules the reason of the inconsistency,
encoded as a pair of assignments to fields which contradict one of the uniqueness rules.

As expected, all instances time out without EBL, because the logic program has no
information about the rules of the puzzle and blindly guesses all assignments, which are
subsequently checked by the external atom. With EBL, however, the Sudoku instances could be
solved within several seconds.
More details on the experiments and links to benchmarks and benchmark generators can
be found at http://www.kr.tuwien.ac.at/research/systems/dlvhex/experiments.html.



6 Discussion and Conclusion 

The basic idea of our algorithm is related to constraint ASP solving presented in Gebser
et al. (2009) and Ostrowski and Schaub (2012), which is realized in the CLINGCON
system. External atom evaluation in our algorithm can superficially be regarded as constraint
propagation. However, while both Gebser et al. (2009) and Ostrowski and Schaub
(2012) consider a particular application, we deal with a more abstract interface to external
sources. An important difference between CLINGCON and EBL is that in CLINGCON the
constraint solver is seen as a black box, whereas we exploit known properties of external
sources. Moreover, we support user-defined learning, i.e., customization of the default
construction of conflict clauses to incorporate knowledge about the sources, as discussed in
Section 4. Another difference is the construction of conflict clauses. ASP with CP has special
constraint atoms, which may be contradictory, e.g., T(X > 10) and T(X = 5). The learned
clauses are sets of constraint literals, which are kept as small as possible. In our algorithm we
usually have no conflicts between ground external atoms as output atoms are mostly independent
of each other (except, e.g., for functional sources). Instead, we have a strong relationship
between the input and the output. This is reflected by conflict clauses which usually consist
of (relevant) input atoms and the negation of one output atom. As in constraint ASP
solving, the key for efficiency is keeping conflict clauses small.

We have extended conflict-driven ASP solving techniques from ordinary ASP to HEX- 
programs, which allow for using external atoms to access external sources. Our approach 
uses two types of learning. The classical type is conflict-driven clause learning, which 
derives conflict nogoods from conflict situations while the search tree is traversed. Adding 
such nogoods prevents the algorithm from running into similar conflicts again. 

Our main contribution is a second type of learning which we call external behavior 
learning (EBL). Whenever external atoms are evaluated, further nogoods may be added 
which capture parts of the external source behavior. In the simplest case these nogoods 
encode that a certain input to the source leads to a certain output. This default learning 
function can be customized to learn shorter or more general nogoods. Customization is 
either done explicitly by the user, or learning functions are derived automatically from 
known properties of external atoms, which can be stated either on the level of external 
predicates or on the level of atoms. Currently we exploit monotonicity and functionality. 

Future work includes the identification of further properties which allow for automatic
derivation of learning functions. We further plan to develop a user-friendly language
for writing user-defined learning functions. Currently, the learned nogoods must be specified
by hand. It may be more convenient to write rules stating that a certain input to
an external source leads to a certain output, in (a restricted variant of) ASP or a more
convenient language. The challenge is that the evaluation of learning rules introduces additional
overhead; hence there is another tradeoff between costs and benefits of EBL. Finally,
the development of heuristics for lazy evaluation of external sources is also subject to future
work.